Asynchronous iterative solution for dominant eigenvectors with applications in performance modelling and PageRank

Author

  • Douglas Vincent de Jager
Abstract

Performance analysis calculations, for models of any complexity, require a distributed computation effort that can easily occupy a large compute cluster for many days. Producing a simple steady-state measure involves an enormous dominant eigenvector calculation, with even modest performance models having upwards of 10 variables. Computations such as passage-time analysis are an order of magnitude more difficult, involving many hundreds of repeated linear system calculations. As models describe greater concurrency, the state space of the model grows, and with it the size of any performance analysis problem being attempted.
The PageRank algorithm is used by Google to measure the relative importance of web pages. It does this by formulating and solving a similarly enormous dominant eigenvector problem, with one variable for every page on the web. As with performance problems, the size of the underlying calculation grows with the number of web pages. With the number of web pages currently estimated to exceed 1 trillion, the PageRank problem requires many thousands of computers running concurrently over many different clusters.
Both problems share the same underlying mathematical type and the same requirement to run effectively on large distributed clusters. Traditional iterative solution methods scale poorly over large distributed architectures because of the inherent requirement to communicate and synchronise at every iteration step. Asynchronous iterative methods have been around since the 1950s and have been shown to be very successful in other contexts when implemented across large distributed architectures, but they have not yet been applied to dominant eigenvector problems without some form of restriction.
According to the current state of the art in asynchronous techniques, application to dominant eigenvector problems requires a fixed bound on how and when updates can happen, and thus effectively a bound on the asynchronous communication itself. In this thesis, we show how to apply asynchronous iterative methods to dominant eigenvector problems without any such restrictions. We do this by showing how to map homogeneous, singular linear systems to inhomogeneous, non-singular linear systems which share the same solution. We present a single asynchronous iterative solution framework for performance analysis problems. We also present three particular solution algorithms. We demonstrate analytically and empirically that asynchronous iterative methods offer significant advantages over traditional synchronous solution methods. We use the theoretical tools which we introduce in this thesis to reduce the complexity of the PageRank problem, limiting the ever-increasing impact of dangling web pages. We generate a smaller, sparser problem which may be solved using asynchronous iterative methods.
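
The idea of replacing a homogeneous, singular system with an inhomogeneous, non-singular one that has the same solution can be illustrated on PageRank. The sketch below is not the mapping developed in the thesis; it uses the widely known reformulation of PageRank as the linear system (I - αPᵀ)x = (1 - α)v, and it only simulates asynchrony sequentially by updating one randomly chosen component at a time in place. The function names, the damping factor α = 0.85, the uniform teleport vector v and the four-page example matrix are all assumptions made for illustration.

```python
import random


def pagerank_power(P, alpha=0.85, tol=1e-12, max_iter=1000):
    """Synchronous power iteration: every component is updated from the
    same previous iterate, which is the step that needs a global barrier."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        # new = alpha * P^T x + (1 - alpha) * v, with v uniform (assumption)
        new = [(1.0 - alpha) / n] * n
        for i in range(n):
            for j in range(n):
                new[j] += alpha * P[i][j] * x[i]
        if sum(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


def pagerank_async_style(P, alpha=0.85, sweeps=400, seed=0):
    """Solve (I - alpha * P^T) x = (1 - alpha) * v by updating one randomly
    chosen component at a time, in place, with no barrier between updates.
    This is a sequential simulation of asynchrony, not a distributed solver."""
    n = len(P)
    rng = random.Random(seed)
    x = [0.0] * n  # any starting vector works; no normalisation is applied
    b = (1.0 - alpha) / n
    for _ in range(sweeps * n):
        j = rng.randrange(n)
        x[j] = b + alpha * sum(P[i][j] * x[i] for i in range(n))
    return x


# Hypothetical row-stochastic link matrix for four pages (no dangling pages).
EXAMPLE_P = [
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1 / 3, 1 / 3, 0.0, 1 / 3],
    [0.5, 0.0, 0.5, 0.0],
]

if __name__ == "__main__":
    print(pagerank_power(EXAMPLE_P))
    print(pagerank_async_style(EXAMPLE_P))
```

Because the iteration matrix αPᵀ is non-negative with spectral radius α < 1, the non-singular system has a unique solution, so the in-place updates need neither a fixed update order nor a normalisation step, and both functions should agree up to numerical tolerance.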

Similar resources

University of London Imperial College London Department of Computing Asynchronous Iterative Solution for Dominant Eigenvectors with Applications in Performance Modelling and PageRank

Performance analysis calculations, for models of any complexity, require a distributed computation effort that can easily occupy a large compute cluster for many days. Producing a simple steady-state measure involves an enormous dominant eigenvector calculation, with even modest performance models having upwards of 10 variables. Computations such as passage-time analysis are an order of magnitu...

Asynchronous iterative methods for the effective computation of PageRank

Iterative algorithms are the building blocks of important scientific computations. However, their semantics-preserving implementation over modern distributed computing platforms introduces synchronization phases between cooperating tasks. These phases increase overall idle time and put tight upper bounds on performance. A drastic measure would be a total elimination of these phases: Each process...

New iterative methods with seventh-order convergence for solving nonlinear equations

In this paper, seventh-order iterative methods for the solution of nonlinear equations are presented. The new iterative methods are developed by using the weight function method and an approximation for the last derivative, which reduces the required number of functional evaluations per step. Several examples are given to illustrate the efficiency and the performance of the new iterative methods.

The Evaluation of the Team Performance of MLB Applying PageRank Algorithm

Background. The current MLB win-loss ranking model has a weakness: it is calculated only from game results, so we assume that a ranking system that considers the opponent's team performance is necessary. Objectives. This study proposes the PageRank algorithm as a complement to the winning-ratio ranking when calculating team rankings in US MLB. ...

A modified Mann iterative scheme for a sequence of nonexpansive mappings and a monotone mapping with applications

In a real Hilbert space, an iterative scheme is considered to obtain strong convergence which is an essential tool to find a common fixed point for a countable family of nonexpansive mappings and the solution of a variational inequality problem governed by a monotone mapping. In this paper, we give a procedure which results in developing Shehu's result to solve equilibrium prob...

Publication date: 2009